A framework for improved video text detection and recognition

Internal identifier: 000139 (Main/Exploration); previous: 000138; next: 000140

Authors: Haojin Yang [Germany]; Bernhard Quehl [Germany]; Harald Sack [Germany]

Source:

Multimedia Tools and Applications [ISSN 1380-7501]; 2014.

RBID: Pascal:14-0217177

French descriptors

Pascal (Inist):
- Signal vidéo, Reconnaissance caractère, Texte, Reconnaissance forme, Recherche information, Traitement image, Indexation, Vision ordinateur, Bibliothèque électronique, Vidéothèque, Collecticiel, Workflow, Sémantique, Processus métier, Rappel, Taux fausse alarme, Classification à vaste marge, Localisation.
Wicri:
- topic: Vidéothèque.

English descriptors

KwdEn:
- Business process, Character recognition, Computer vision, Electronic library, False alarm rate, Groupware, Image processing, Indexing, Information retrieval, Localization, Pattern recognition, Recall, Semantics, Text, Vector support machine, Video library, Video signal, Workflow.

Abstract

Text displayed in a video is an essential part of the high-level semantic information of the video content. Therefore, video text can be used as a valuable source for automated video indexing in digital video libraries. In this paper, we propose a workflow for video text detection and recognition. In the text detection stage, we have developed a fast localization-verification scheme, in which an edge-based multi-scale text detector first identifies potential text candidates with a high recall rate. Then, detected candidate text lines are refined by using an image entropy-based filter. Finally, Stroke Width Transform (SWT)- and Support Vector Machine (SVM)-based verification procedures are applied to eliminate the false alarms. For text recognition, we have developed a novel skeleton-based binarization method in order to separate text from complex backgrounds to make it processible for standard OCR (Optical Character Recognition) software. Operability and accuracy of the proposed text detection and binarization methods have been evaluated by using publicly available test data sets.
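
The abstract does not say how the image entropy-based filter is computed. As a rough illustration only, the sketch below measures the Shannon entropy of the gray-level histogram of a candidate region and discards regions that fall below a threshold. The function names, the threshold value, and the assumption that low-entropy candidates are the ones rejected are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def shannon_entropy(pixels):
    """Shannon entropy (in bits) of the gray-level distribution of a region."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def keep_candidate(region, min_entropy=0.5):
    """Keep a candidate text line only if its entropy reaches the threshold.

    `region` is a list of rows of gray levels (0-255); `min_entropy` is a
    hypothetical threshold, not a value taken from the paper.
    """
    pixels = [p for row in region for p in row]
    return bool(pixels) and shannon_entropy(pixels) >= min_entropy

# Two toy regions: a uniform background and a high-contrast stroke pattern.
flat_region = [[128] * 8 for _ in range(4)]
text_like_region = [[0, 255] * 4 for _ in range(4)]

print(keep_candidate(flat_region), keep_candidate(text_like_region))
```

In a full pipeline, a filter of this kind would only prune candidates cheaply; the SWT- and SVM-based verification named in the abstract would still be needed to reject the remaining false alarms.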

Affiliations:

Germany

Links to previous steps (curation, corpus, ...)

to stream PascalFrancis, to step Corpus: 000006
to stream PascalFrancis, to step Curation: 000759
to stream PascalFrancis, to step Checkpoint: 000022
to stream Main, to step Merge: 000140
to stream Main, to step Curation: 000139

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">A framework for improved video text detection and recognition</title>
<author><name sortKey="Haojin Yang" sort="Haojin Yang" uniqKey="Haojin Yang" last="Haojin Yang">HAOJIN YANG</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Prof.-Dr.-Helmert Str. 2-4</s1>
<s2>14467 Potsdam</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
<wicri:noRegion>Prof.-Dr.-Helmert Str. 2-4</wicri:noRegion>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Quehl, Bernhard" sort="Quehl, Bernhard" uniqKey="Quehl B" first="Bernhard" last="Quehl">Bernhard Quehl</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Prof.-Dr.-Helmert Str. 2-4</s1>
<s2>14467 Potsdam</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
<wicri:noRegion>Prof.-Dr.-Helmert Str. 2-4</wicri:noRegion>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Sack, Harald" sort="Sack, Harald" uniqKey="Sack H" first="Harald" last="Sack">Harald Sack</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Prof.-Dr.-Helmert Str. 2-4</s1>
<s2>14467 Potsdam</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
<wicri:noRegion>Prof.-Dr.-Helmert Str. 2-4</wicri:noRegion>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">14-0217177</idno>
<date when="2014">2014</date>
<idno type="stanalyst">PASCAL 14-0217177 INIST</idno>
<idno type="RBID">Pascal:14-0217177</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000006</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000759</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000022</idno>
<idno type="wicri:doubleKey">1380-7501:2014:Haojin Yang:a:framework:for</idno>
<idno type="wicri:Area/Main/Merge">000140</idno>
<idno type="wicri:Area/Main/Curation">000139</idno>
<idno type="wicri:Area/Main/Exploration">000139</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">A framework for improved video text detection and recognition</title>
<author><name sortKey="Haojin Yang" sort="Haojin Yang" uniqKey="Haojin Yang" last="Haojin Yang">HAOJIN YANG</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Prof.-Dr.-Helmert Str. 2-4</s1>
<s2>14467 Potsdam</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
<wicri:noRegion>Prof.-Dr.-Helmert Str. 2-4</wicri:noRegion>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Quehl, Bernhard" sort="Quehl, Bernhard" uniqKey="Quehl B" first="Bernhard" last="Quehl">Bernhard Quehl</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Prof.-Dr.-Helmert Str. 2-4</s1>
<s2>14467 Potsdam</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
<wicri:noRegion>Prof.-Dr.-Helmert Str. 2-4</wicri:noRegion>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Sack, Harald" sort="Sack, Harald" uniqKey="Sack H" first="Harald" last="Sack">Harald Sack</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Prof.-Dr.-Helmert Str. 2-4</s1>
<s2>14467 Potsdam</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
<wicri:noRegion>Prof.-Dr.-Helmert Str. 2-4</wicri:noRegion>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">Multimedia tools and applications</title>
<title level="j" type="abbreviated">Multimed. tools appl.</title>
<idno type="ISSN">1380-7501</idno>
<imprint><date when="2014">2014</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">Multimedia tools and applications</title>
<title level="j" type="abbreviated">Multimed. tools appl.</title>
<idno type="ISSN">1380-7501</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Business process</term>
<term>Character recognition</term>
<term>Computer vision</term>
<term>Electronic library</term>
<term>False alarm rate</term>
<term>Groupware</term>
<term>Image processing</term>
<term>Indexing</term>
<term>Information retrieval</term>
<term>Localization</term>
<term>Pattern recognition</term>
<term>Recall</term>
<term>Semantics</term>
<term>Text</term>
<term>Vector support machine</term>
<term>Video library</term>
<term>Video signal</term>
<term>Workflow</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Signal vidéo</term>
<term>Reconnaissance caractère</term>
<term>Texte</term>
<term>Reconnaissance forme</term>
<term>Recherche information</term>
<term>Traitement image</term>
<term>Indexation</term>
<term>Vision ordinateur</term>
<term>Bibliothèque électronique</term>
<term>Vidéothèque</term>
<term>Collecticiel</term>
<term>Workflow</term>
<term>Sémantique</term>
<term>Processus métier</term>
<term>Rappel</term>
<term>Taux fausse alarme</term>
<term>Classification à vaste marge</term>
<term>Localisation</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Vidéothèque</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Text displayed in a video is an essential part of the high-level semantic information of the video content. Therefore, video text can be used as a valuable source for automated video indexing in digital video libraries. In this paper, we propose a workflow for video text detection and recognition. In the text detection stage, we have developed a fast localization-verification scheme, in which an edge-based multi-scale text detector first identifies potential text candidates with a high recall rate. Then, detected candidate text lines are refined by using an image entropy-based filter. Finally, Stroke Width Transform (SWT)- and Support Vector Machine (SVM)-based verification procedures are applied to eliminate the false alarms. For text recognition, we have developed a novel skeleton-based binarization method in order to separate text from complex backgrounds to make it processible for standard OCR (Optical Character Recognition) software. Operability and accuracy of the proposed text detection and binarization methods have been evaluated by using publicly available test data sets.</div>
</front>
</TEI>
<affiliations><list><country><li>Allemagne</li>
</country>
</list>
<tree><country name="Allemagne"><noRegion><name sortKey="Haojin Yang" sort="Haojin Yang" uniqKey="Haojin Yang" last="Haojin Yang">HAOJIN YANG</name>
</noRegion>
<name sortKey="Quehl, Bernhard" sort="Quehl, Bernhard" uniqKey="Quehl B" first="Bernhard" last="Quehl">Bernhard Quehl</name>
<name sortKey="Sack, Harald" sort="Sack, Harald" uniqKey="Sack H" first="Harald" last="Sack">Harald Sack</name>
</country>
</tree>
</affiliations>
</record>
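
The record above uses the `wicri:` and `inist:` prefixes without declaring them, so a namespace-aware XML parser may refuse to read it as is. The sketch below shows one possible workaround in Python: wrap the record in an element that declares placeholder namespaces, then read the title, authors, and English keywords. The `RECORD` excerpt is shortened for illustration, and the wrapper element, the `urn:example:` URIs, and the function names are assumptions, not part of Dilib or Wicri.

```python
import xml.etree.ElementTree as ET

# Shortened excerpt of the record above; only the elements needed here.
RECORD = """<record><TEI><teiHeader><fileDesc><titleStmt>
<title xml:lang="en" level="a">A framework for improved video text detection and recognition</title>
<author><name sortKey="Quehl, Bernhard">Bernhard Quehl</name>
<affiliation wicri:level="1"><country>Allemagne</country></affiliation>
</author>
</titleStmt></fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Character recognition</term>
<term>Video signal</term>
</keywords></textClass></profileDesc>
</teiHeader></TEI></record>"""

# Placeholder namespace URIs (assumptions); the record itself declares none.
WRAPPER = ('<wrap xmlns:wicri="urn:example:wicri" '
           'xmlns:inist="urn:example:inist">{}</wrap>')

def parse_record(text):
    """Parse a record whose wicri:/inist: prefixes are undeclared."""
    return ET.fromstring(WRAPPER.format(text)).find("record")

def english_keywords(record):
    """Return the terms of the KwdEn keyword list."""
    return [t.text for t in record.findall(".//keywords[@scheme='KwdEn']/term")]

record = parse_record(RECORD)
title = record.findtext(".//titleStmt/title")
authors = [n.text for n in record.findall(".//titleStmt/author/name")]
keywords = english_keywords(record)
print(title, authors, keywords)
```

Wrapping the text leaves the record itself untouched, which matters if the same file is also processed by the Dilib tools.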

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000139 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000139 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:14-0217177
   |texte=   A framework for improved video text detection and recognition
}}
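
When links are needed for many records, the template above could be generated rather than typed by hand. The following Python sketch renders the same `Explor lien` call from its seven fields. The parameter names `wiki`, `area`, `flux`, `étape`, `type`, `clé`, and `texte` come from the template above; the function `explor_link` and its argument names are hypothetical.

```python
def explor_link(wiki, area, flux, step, key_type, key, text):
    """Render an 'Explor lien' template call with values aligned in a column."""
    fields = [
        ("wiki", wiki),
        ("area", area),
        ("flux", flux),
        ("étape", step),
        ("type", key_type),
        ("clé", key),
        ("texte", text),
    ]
    lines = ["{{Explor lien"]
    # Pad "name=" to nine characters so that every value starts in the same column.
    lines += ["   |" + (name + "=").ljust(9) + value for name, value in fields]
    lines.append("}}")
    return "\n".join(lines)

link = explor_link(
    wiki="Ticri/CIDE",
    area="OcrV1",
    flux="Main",
    step="Exploration",
    key_type="RBID",
    key="Pascal:14-0217177",
    text="A framework for improved video text detection and recognition",
)
print(link)
```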

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

OCR exploration server
Warning: this site is under development. It was generated automatically from raw corpora, so the information has not been validated.
